Statistics theory

image-20200804110715498

Normal distribution written in exponential family form

image-20200808230114972

Ancillary precision (how an ancillary statistic helps when estimating $\theta$)

image-20200808232204111

image-20200808232220253

image-20200808232234866

image-20200808232534895

Moments of normal distribution

image-20200719230938507

image-20200803145112498
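As a quick numeric companion to the normal-moment formulas, the even moments of a standard normal satisfy $E[Z^{2k}] = (2k-1)!!$, which can be checked against the gamma-function form $2^k\Gamma(k+1/2)/\sqrt{\pi}$ (a sketch, not from the screenshots above):

```python
import math

def normal_even_moment(k):
    """E[Z^(2k)] for Z ~ N(0,1), via the gamma-function form 2^k * Gamma(k+1/2) / sqrt(pi)."""
    return 2**k * math.gamma(k + 0.5) / math.sqrt(math.pi)

def double_factorial_odd(m):
    """(2k-1)!! = 1*3*5*...*m for odd m."""
    out = 1
    for j in range(1, m + 1, 2):
        out *= j
    return out

# E[Z^2] = 1, E[Z^4] = 3, E[Z^6] = 15, E[Z^8] = 105
for k in range(1, 5):
    assert abs(normal_even_moment(k) - double_factorial_odd(2 * k - 1)) < 1e-9
```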

Chebyshev's inequality

image-20200720115229399
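A minimal Monte Carlo sanity check of Chebyshev's bound $P(|X-\mu| \geq k\sigma) \leq 1/k^2$ (standard normal data and the values of $k$ are my own choices; for the normal the bound is very loose):

```python
import random

# Monte Carlo check of Chebyshev's inequality: P(|X - mu| >= k*sigma) <= 1/k^2
random.seed(0)
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]   # mu = 0, sigma = 1
for k in (1.5, 2.0, 3.0):
    tail = sum(abs(x) >= k for x in xs) / n       # empirical tail probability
    assert tail <= 1 / k**2                       # the bound holds (loosely here)
```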

Conditional distribution of multinormal distribution

image-20200720002553353
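The conditional-normal formulas can be sanity-checked numerically in the bivariate case; all parameter values below are made up, and only the Schur-complement structure is the point:

```python
# Bivariate normal: X1 | X2 = x2 ~ N( mu1 + s12/s22*(x2 - mu2), s11 - s12^2/s22 ),
# the 1-D instance of mu1 + Sigma12 Sigma22^{-1}(x2 - mu2) and Sigma11 - Sigma12 Sigma22^{-1} Sigma21.
mu1, mu2 = 1.0, -2.0            # made-up means
s11, s12, s22 = 4.0, 1.2, 9.0   # made-up covariance entries (positive definite)
x2 = 0.5                        # made-up conditioning value

cond_mean = mu1 + s12 / s22 * (x2 - mu2)
cond_var = s11 - s12**2 / s22

# sanity: conditioning shrinks the variance, and the positive covariance
# pulls the mean upward since x2 > mu2
assert 0 < cond_var < s11
assert cond_mean > mu1
```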

Exponential distribution, gamma distribution and chi-square distribution

image-20200722000304062
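One way to check the standard relationships here: if $X_i \sim \text{Exp}(\text{rate }\lambda)$ iid, then $\sum X_i \sim \text{Gamma}(n, \lambda)$ and $2\lambda\sum X_i \sim \chi^2_{2n}$. The change-of-variables density identity can be verified pointwise (the values of $n$, $\lambda$, $x$ below are arbitrary):

```python
import math

def gamma_pdf(x, shape, rate):
    return rate**shape * x**(shape - 1) * math.exp(-rate * x) / math.gamma(shape)

def chi2_pdf(y, df):
    k = df / 2
    return y**(k - 1) * math.exp(-y / 2) / (2**k * math.gamma(k))

# If S ~ Gamma(n, rate lam), then 2*lam*S ~ chi^2_{2n}; by change of variables,
# gamma_pdf(x, n, lam) == 2*lam * chi2_pdf(2*lam*x, 2*n) for every x > 0.
n, lam = 5, 0.7
for x in (0.5, 2.0, 7.0):
    assert abs(gamma_pdf(x, n, lam) - 2 * lam * chi2_pdf(2 * lam * x, 2 * n)) < 1e-12
```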

Multiparameter MLE

image-20200723124721954

image-20200723124731745

image-20200723124632235

image-20200723124700323

image-20200723124850893

image-20200723124926761

image-20200723125002238

If we have some restrictions on $\theta$, say $\theta > 0$,

image-20200723135157947

MLE may not be unique

image-20200723135637145

MLE in exponential family

image-20200723142716742

image-20200723142825756

Bayes estimator

Intuition: if $\theta$ were fixed, we would want to find an estimator of $\theta$ whose MSE is minimized. From the Bayesian point of view, our goal becomes:

image-20200723145713803

image-20200723145814434

image-20200723150147895
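Under squared-error loss the Bayes estimator is the posterior mean. A minimal sketch with a Beta-Binomial conjugate pair (the prior and data values are made up, not from the screenshots above):

```python
# Bayes estimator under squared-error loss = posterior mean.
# Beta(a, b) prior on p, Binomial(n, p) likelihood -> posterior Beta(a + x, b + n - x),
# so the Bayes estimator is (a + x) / (a + b + n).  (a, b, n, x are made-up numbers.)
a, b = 2.0, 3.0
n, x = 10, 7
bayes_est = (a + x) / (a + b + n)

# It is a weighted average of the prior mean a/(a+b) and the MLE x/n,
# with weight w = n/(a+b+n) on the data:
w = n / (a + b + n)
assert abs(bayes_est - (w * x / n + (1 - w) * a / (a + b))) < 1e-12
```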

Proving the Cramér-Rao lower bound

image-20200723221757963

image-20200723221830197

image-20200723221902753

image-20200723221936751

UMVUE in normal distribution

image-20200723222956747

image-20200723223139060

Situations where the key condition for the Cramér-Rao lower bound is violated.

image-20200723224057735

image-20200723230455565

image-20200723230523126

UMVUE is uncorrelated with every unbiased estimator of 0

image-20200724223040517

image-20200724223107646

image-20200724223155943

An alternative to Theorem 7.3.20 is that

image-20200725113118840

image-20200725113202499

Example that UMVUE does not exist

image-20200724224008038

image-20200724224501738

image-20200724224657895

image-20200724224721455

image-20200724224939037

Lehmann-Scheffe Theorem

image-20200724230439342

image-20200726124827821

Find a complete statistic in the uniform distribution

image-20200725020418457

image-20200725020626084

If we are estimating $g(\theta)$ instead of $\theta$,

image-20200725115905701

image-20200725120010372

image-20200725020832440

To show this, intuitively, we can go back to image-20200725020946521

Here, taking the derivative with respect to $\theta$ over $\theta>1$ only yields that $g(t) = 0$ for all $t>1$;

there exist many choices of $g(t)$ with $g(t) \neq 0$ when $0<t<1$.

image-20200725113944069

image-20200725114305007

image-20200725114445953

image-20200725114816359

UMVUE in the Poisson family

image-20200725120956565

image-20200725121035951
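A numeric check of the classical Poisson example: with $T=\sum X_i \sim \text{Poisson}(n\theta)$ complete sufficient, $((n-1)/n)^T$ is unbiased for $e^{-\theta}$ (hence the UMVUE). The expectation can be verified by summing the Poisson series directly (the values of $n$ and $\theta$ are arbitrary):

```python
import math

# E[((n-1)/n)^T] with T ~ Poisson(n*theta) equals
# exp(-n*theta) * exp((n-1)*theta) = exp(-theta).
n, theta = 4, 1.3
c = (n - 1) / n
term = math.exp(-n * theta)   # t = 0 term of the Poisson pmf times c^0
mean = term
for t in range(1, 150):       # truncated series; the tail is negligible
    term *= c * n * theta / t # multiply in c^t (n*theta)^t / t! incrementally
    mean += term
assert abs(mean - math.exp(-theta)) < 1e-10
```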

Proving $S$ is not unbiased for $\sigma$

image-20200725162641622

image-20200725162707994

image-20200725162730638
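The bias of $S$ can also be seen numerically: for normal samples $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$ gives the closed form $E[S] = \sigma\sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma((n-1)/2)$, which is strictly below $\sigma$ (a sketch; `lgamma` avoids overflow for large $n$):

```python
import math

# E[S] = sigma * sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2) < sigma,
# so S is biased low for sigma even though S^2 is unbiased for sigma^2.
def mean_of_S(n, sigma=1.0):
    return sigma * math.sqrt(2 / (n - 1)) * math.exp(
        math.lgamma(n / 2) - math.lgamma((n - 1) / 2))

for n in (2, 5, 30):
    assert mean_of_S(n) < 1.0               # biased downward...
assert abs(mean_of_S(1000) - 1.0) < 1e-3    # ...but the bias vanishes as n grows
```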

UMVUE in normal family

image-20200725162818544

image-20200725162919977

UMVUE in the exponential distribution family

image-20200725163953882

image-20200725165010924

image-20200729105238851

UMVUE in simple linear regression

image-20200725213847264

image-20200725213955112

image-20200725220343603

image-20200725220450337

image-20200725220515829

image-20200725220551988

image-20200725220858319

image-20200725221500193

image-20200725221529577

UMP test using Neyman-Pearson lemma

image-20200727145648886

image-20200727145716157

image-20200727152539567

image-20200727152612318

image-20200727152917412

Existence of UMP without monotone likelihood ratio

image-20200727214457258

image-20200727214517705

image-20200727214541828

How to show the UMP test does not exist

image-20200727215052207

image-20200727215128059

UMPU test in exponential family

image-20200727230331011

Definition of the p-value

image-20200727231203741

Definition of confidence coefficient

image-20200727233541210

Confidence sets by inverting test statistics

Inverting the acceptance region of the test will give us the confidence set.

Inverting the acceptance region of the UMP test will give us the UMA confidence interval.

Inverting the acceptance region of the UMPU test will give us the UMAU confidence interval.

image-20200728113548862

Pivoting cdf’s

image-20200728110924052

image-20200728112039598

image-20200728112055489

Inverting test to obtain confidence intervals

  • Invert the acceptance region of a level $\alpha$ test. This can be a one-sided test, a one-sided UMP test, an LRT, a two-sided test, or a two-sided UMPU test.

    image-20200804213306450

    image-20200804213312747

  • Construct a one-sided test if you only want a lower or upper bound. For example, constructing $H_0:\theta = \theta_0$ vs. $H_1:\theta>\theta_0$ will give a lower bound for $\theta$ if the family has an MLR.

  • Intuitively, when you are testing the above hypothesis with $T$ given and $\theta_0$ unknown, you can try different values of $\theta_0$ and see whether $T$ still falls in the acceptance region. Naturally, you can push $\theta_0$ to the leftmost point until $T$ falls into the rejection region (which is determined by $\theta_0$). That is why MLR is also required here.

  • Example:

    image-20200804205855240

    image-20200804205914639

This family obviously has an MLR.

Here $m(p)$ acts as the leftmost point of the rejection region, so $y$ should always fall into the acceptance region, i.e. $y \leq m(p)$ holds at all times.
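The inversion can be sketched numerically for a binomial lower confidence bound (the counts `y`, `n` and level are made up). The confidence set is $\{p_0 : P_{p_0}(Y \geq y) > \alpha\}$; by MLR the tail probability is increasing in $p_0$, so the boundary can be found by bisection:

```python
import math

def binom_sf(y, n, p):
    """P(Y >= y) for Y ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y, n + 1))

def lower_conf_bound(y, n, alpha=0.05):
    """Invert the one-sided test H0: p = p0 vs H1: p > p0.
    Accept p0 iff P_{p0}(Y >= y) > alpha; the lower bound solves
    P(Y >= y | p) = alpha, found here by bisection (MLR => monotone tail)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_sf(y, n, mid) > alpha:
            hi = mid      # mid is accepted; boundary is at or below mid
        else:
            lo = mid      # mid is rejected; boundary is above mid
    return lo

p_L = lower_conf_bound(y=8, n=10, alpha=0.05)
assert 0.0 < p_L < 8 / 10   # the lower bound sits below the point estimate
```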

Pivotal quantity to obtain the confidence interval

  • Pivot a cdf.

    image-20200804212616161

    image-20200804212628012

Shortest length confidence interval

image-20200804214230382

Consistent estimator

image-20200804220715895

  • Example of evaluating directly:

    image-20200804220941931

  • Example of a trick constructing the consistent estimator:

    image-20200805000703607

    image-20200805000727092

    The last step uses Slutsky's theorem, which I once mistook for the continuous mapping theorem, so pay attention to that.


Consistency of MLE

image-20200805002229580

MLE is asymptotically efficient

That is, its asymptotic variance attains the CR lower bound.

Asymptotic relative efficiency

image-20200805110111298

image-20200805110132223

Asymptotic test

image-20200805111736101

image-20200805111807190

Wald test

image-20200805112044150

Score test

image-20200805112643109
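A side-by-side sketch of the two statistics in the Bernoulli case (the counts and $p_0$ are made up): the Wald test standardizes by the estimated information (plugging in $\hat p$), while the score test standardizes by the null information (plugging in $p_0$); both are asymptotically $N(0,1)$ under $H_0$:

```python
import math

# Wald vs. score statistics for H0: p = p0 with Binomial(n, p) data.
n, y, p0 = 100, 58, 0.5
p_hat = y / n
wald = (p_hat - p0) / math.sqrt(p_hat * (1 - p_hat) / n)   # estimated info
score = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)        # null info

assert wald != score               # they differ in finite samples
assert abs(score - 1.6) < 1e-9     # (0.58 - 0.5) / sqrt(0.25/100)
```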

Wald test, Score test theorem

image-20200805112730780

image-20200805112805366

image-20200805113132298

Asymptotic confidence sets

  • Asymptotic pivotal quantity:

    Based on the CLT and Slutsky's theorem

    image-20200805120558239

    Here, the reason we can substitute $S^2$ for $\sigma^2$ is that $S^2$ converges to $\sigma^2$ almost surely.
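A simulation sketch of this pivot (my own setup, Exponential(1) data): by the CLT plus Slutsky, $\sqrt{n}(\bar X-\mu)/S \to N(0,1)$, so $\bar X \pm 1.96\,S/\sqrt n$ has coverage close to 95% even for non-normal data:

```python
import math
import random

random.seed(0)
mu, n, reps, z = 1.0, 50, 5000, 1.96
hits = 0
for _ in range(reps):
    xs = [random.expovariate(1.0) for _ in range(n)]   # Exponential(1): mu = sigma = 1
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar)**2 for x in xs) / (n - 1))
    if abs(xbar - mu) <= z * s / math.sqrt(n):
        hits += 1
coverage = hits / reps
assert 0.90 < coverage < 0.98   # close to the nominal 0.95, with CLT error
```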

Qualifying exam 2019

image-20200801160026997

  • (a): write the pdf in the exponential family form

    image-20200801171447814

  • (d): use the Lehmann-Scheffé theorem: find a statistic that is a function of the CSS and unbiased for $\theta^2$

  • (e): to calculate the ARE, we need the limiting variance of both estimators. For $X_1/n$, the CLT can be used. For the MLE, the limiting variance attains the CR lower bound, $1/I(\theta)$; note that the quantity after "with respect to" goes in the numerator.

  • (f): apply the delta method.
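The delta method step can be sketched with a concrete $g$ (the choice $g(x)=x^2$ and the parameter values are mine): if $\sqrt n(\bar X-\mu)\to N(0,\sigma^2)$, then $\sqrt n(g(\bar X)-g(\mu))\to N(0, g'(\mu)^2\sigma^2)$. For normal data $\bar X\sim N(\mu,\sigma^2/n)$ exactly, so the approximation can be compared to the exact variance:

```python
# For g(x) = x^2 the delta-method variance of g(Xbar) is 4*mu^2*sigma^2/n,
# while the exact variance (normal data) is 4*mu^2*sigma^2/n + 2*sigma^4/n^2.
# The ratio tends to 1, confirming the delta method captures the leading term.
mu, sigma = 2.0, 1.5
for n in (10, 100, 10000):
    exact = 4 * mu**2 * sigma**2 / n + 2 * sigma**4 / n**2
    approx = 4 * mu**2 * sigma**2 / n
    assert abs(exact / approx - 1) < 2 / n   # ratio -> 1 as n grows
```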

image-20200801172858913

  • (a): apply CLT directly

  • (b): 把$Y_i - \hat{\mu}$ 拆成$Y_i - \mu + \mu - \hat{\mu}$, 再用slutsky’s theorem。

  • (c): 在(b)的基础上替换$\hat{\sigma^3}$。

  • 注意使用LLN的条件只需要E(X)存在,使用CLT的条件还需要方差有限。

  • 两个以概率收敛的项相加,得到的也是以概率收敛的。

Qualifying exam 2018

image-20200801181448415

  • (c): Find the CSS, and try a transformation to make it unbiased. Here our CSS is $log(X_1\cdots X_n)$, and we already know that $\bar{X}$ is unbiased for $\theta$, so naturally we try the transformation $e^{log(X_1 \cdots X_n)/n}$. To ease calculation, first derive the distribution of $log(X_1 \cdots X_n) / n$, then apply the exp transformation.
  • (d): MSE = bias^2 + variance, and the UMVUE is unbiased.
  • (e): the MLE is asymptotically normal; its limiting variance can be obtained from the Fisher information. The other estimator's limiting variance requires the CLT and the delta method.
  • (f): for $\bar{X}$, use the CLT to calculate its limiting variance

image-20200801214106104

  • (a): image-20200811223735647
  • (b) remember image-20200801222048063
  • (b) part (iii): 2log($\frac{H_1}{H_0}$) follows a $\chi^2$ distribution with df equal to the difference in dimensions; here, it is p(p+1)/2.

Qualifying exam 2017

image-20200801222516542

image-20200801222532011

  • (a) keep in mind integration by parts
  • (c) You can use the fact that $\hat\theta$ is the MLE: compute its limiting variance from the Fisher information, then use the delta method for the limiting variance of $\gamma$. Alternatively, without using this property, compute the statistic's limiting variance directly via the CLT and the delta method.
  • (d) Wald test, based on the MLE's limiting variance computed in (c). image-20200801223737558

image-20200801224215640

  • (a) (iii): an asymptotic normal distribution sometimes gives an asymptotic test and confidence interval

  • (b) (i) the key is to write

    image-20200802011623299

Qualifying exam 2016

image-20200802011806930

  • (b): the Bayes estimator under squared-error loss is the mean of the posterior distribution of $\theta$.

  • (c): prove the family has an MLR in a test statistic. Two important properties here:

    1. $X$ follows a beta distribution, so $-\log(X)$ follows an exponential distribution.
    2. The sum of $-\log(X_i)$ follows Gamma(n, $\theta$).
  • (e): transform from Gamma(n,$\theta$) to Gamma(n,1)
  • image-20200802014203649
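The first property above is a one-line change of variables: if $X \sim \text{Beta}(\theta,1)$ with pdf $\theta x^{\theta-1}$ on $(0,1)$, then $Y=-\log X$ has pdf $f_X(e^{-y})e^{-y} = \theta e^{-\theta y}$, i.e. Exponential with rate $\theta$. A pointwise numeric check (the values of $\theta$ and $y$ are arbitrary):

```python
import math

# Change of variables: Y = -log(X), X ~ Beta(theta, 1) => Y ~ Exponential(rate theta).
theta = 2.5
for y in (0.1, 1.0, 3.0):
    f_x = theta * math.exp(-y)**(theta - 1)          # f_X evaluated at x = e^{-y}
    f_y = f_x * math.exp(-y)                         # times |dx/dy| = e^{-y}
    assert abs(f_y - theta * math.exp(-theta * y)) < 1e-12
```

Summing $n$ such iid exponentials then gives the Gamma($n$, $\theta$) in the second property.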

image-20200802014541205

  • (a) the key is to show $\hat{\sigma}$ converges in probability to $\sigma$ by the continuous mapping and Slutsky's theorems.

Qualifying exam 2012

image-20200802152621486

  • Same problem as in the 2019 qualifying exam

image-20200802152741111

  • (a) write the pdf to be an exponential family.
  • (b) a transformation of the CSS gives us the UMVUE by Lehmann-Scheffé.

Qualifying exam 2013

image-20200820133121143

  • (a) use the property of MLE, derive the fisher information of $\theta$

  • (c) When the dimension of $\theta$ is smaller than that of $\pi$, show that there is more than the single constraint $\sum \theta = 1$, so $\hat{\pi_2}$ is not the MLE and its variance must be larger than the MLE's.

image-20200820145314488

image-20200820145330588

image-20200820145358431

image-20200820145427662

image-20200802224219785

  • (a): when encountering E(AB), we can express it as E(E(AB|A)).

image-20200820110513513

  • (a) use the central limit theorem to obtain the asymptotic distribution of $\hat{p_1}-\hat{p_2}$, and use Slutsky's theorem by showing that $2\hat{p}\hat{q}$ converges in probability to $2p(1-p)$, where $p_1=p_2=p$. When $p_1 \neq p_2$, rearrange the terms inside $P(\cdot)$ to match the asymptotic distribution of $\hat{p_2}-\hat{p_1}$.
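A sketch of the pooled two-proportion statistic this bullet builds (the counts below are made up): under $H_0: p_1 = p_2$, $(\hat p_1-\hat p_2)/\sqrt{\hat p\hat q(1/n_1+1/n_2)} \to N(0,1)$, where $\hat p$ pools both samples and Slutsky justifies the plug-in:

```python
import math

# Pooled two-proportion z statistic for H0: p1 = p2 (made-up counts).
y1, n1 = 40, 100
y2, n2 = 30, 100
p1_hat, p2_hat = y1 / n1, y2 / n2
p_pool = (y1 + y2) / (n1 + n2)          # pooled estimate under H0
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se
# with n1 = n2 = n, se = sqrt(2*p_hat*q_hat)/sqrt(n), matching the notes
assert abs(se - math.sqrt(2 * p_pool * (1 - p_pool)) / math.sqrt(n1)) < 1e-12
```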

image-20200802231206636

  • (a): Use the Jacobian determinant.

Qualifying exam 2014

image-20200807112351614

  • (b): when calculating the CR lower bound for $f(\theta)$, the formula is $f'(\theta)^2/I(\theta)$
  • (c): $X_nY_n$ also follows a beta distribution.

image-20200807112706077

  • (b) first consider the asymptotic distribution of image-20200807112857371 , then notice that min is a continuous function, so the delta method can be applied.

image-20200807112944915

  • (b) consider $P(X_{n+1} = 1|X_1 = 1) = P(X_{n+1} = 1,X_n = 0| X_1 = 1)+P(X_{n+1} = 1,X_n = 1| X_1 = 1) $

    And $P(X_{n+1} = 1,X_n = 0| X_1 = 1) = P(X_{n+1}=1|X_n=0)*P(X_n=0|X_1=1)$

  • (c) (i) use the recursion to derive $\theta_n$

  • (c) (ii) 证明Cov($X_a$,$X_b$) > 0 by writing it to be $E(X_aX_b) - E(X_a)E(X_b)$ and the first term is $E(E(X_aX_b|X_a)) = E(X_a E(X_b|X_a)) = P(X_a = 1)P(X_b = 1| X_a = 1)$

  • (d) write down the transition probability matrix. P(X_n = 0 or 1|X_{n-1} = 0 or 1).
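The steps above can be sketched with a two-state chain (the transition probabilities `a`, `b` are hypothetical, not from the exam): $P(X_{n+1}=1\mid X_1=1)$ is an entry of the $n$-step matrix $P^n$, and the decomposition in (b) is exactly one step of the matrix multiplication:

```python
# Two-state Markov chain over states {0, 1} with hypothetical rates:
# P(0 -> 1) = a, P(1 -> 0) = b, so P = [[1-a, a], [b, 1-b]].
def mat_mult(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 0.3, 0.2
P = [[1 - a, a], [b, 1 - b]]
Pn = [[1.0, 0.0], [0.0, 1.0]]   # identity
for _ in range(10):              # 10 steps
    Pn = mat_mult(Pn, P)
prob = Pn[1][1]                  # P(X_11 = 1 | X_1 = 1)

# sanity: P^n converges to the stationary probability of state 1, a/(a+b)
assert abs(prob - a / (a + b)) < 0.01
```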

Qualifying exam 2015

image-20200807171803403

  • (a):

    Sijia Fang's solution:

    image-20200808104639271

  • (b): use the three-dimensional central limit theorem and apply the delta method. Another way is to show that $\sqrt{n}$ times the numerator converges in distribution to a normal, while both terms in the denominator converge in probability to constants.

QE sample 1

image-20200817001009675

  • (f) UMPU

image-20200817001131282

QE sample 2

image-20200817000603205

  • (a) notice that $X_1-X_2$ is symmetric about 0.
  • (c) exponential, gamma, chi-square transformations
  • (f) first calculate the pdf of the conditional distribution obtained via the Rao-Blackwell theorem, then do the integral.

image-20200817000921022

STAT609 2019 final

image-20200807122429437

  • (a): use $P(Y \leq y)$.

  • (b): they are not independent, since $P(Y|X) \neq P(Y)$.

  • (c): cov(X,Y) = E(XY) - E(X)E(Y); compute E(XY) with a double integral, and note that given $x$, $y$ is a point mass.

STAT610 2020 final

image-20200803210614429

  • Show that its mean is 0 and its limiting variance equals 0.

image-20200803210726452

  • (a) show $T_n$ is the MLE, and the MLE is asymptotically normal and consistent.
  • (b) the MLE attains the CR lower bound.

image-20200803210936558

  • (a) show $\hat{\sigma^2}$ is a function of $\bar{X_i}$ and $\bar{X_i^2}$.

image-20200803211218742

  • (b) write $a \leq \text{pivot} \leq b$, then show the length of the CI is proportional to $b/a$. Finally, minimize $b/a$ subject to $\int_a^b f(x)dx = 1-\alpha$

image-20200803213501503

  • (a) write $M_n = F^{-1}(W_n)$ and $M = F^{-1}(1/2)$; based on image-20200803214942300 apply the delta method.

  • (b) just calculate $M_n - M$ and show it converges to 0.

  • (c) ARE’s definition:
  • image-20200803223349088
  • (d) the sample mean is more efficient since it is the MLE.
  • $M_n$ is more efficient since it is the MLE.

STAT610 2019

image-20200803225520645

  • (b) one-parameter exponential family

image-20200803230546898

image-20200803231027855

  • (c): image-20200803232420520

image-20200804005340794

  • No, not if they are from a Poisson distribution: when $\bar{X}$ becomes larger, $S^2$ becomes larger.
  • Note that under the normality assumption with iid $X_i$, the two are independent.
  • E($S^2$) = $\sigma^2$ holds under the iid assumption; the proof uses the add-and-subtract-the-mean trick.

image-20200804102413933

image-20200804102426873

  • The rejection region has the form $\lambda < c$; since $-2\log(\lambda)$ has a known distribution, we can rewrite the rejection region as $\{x: -2\log(\lambda) > c^*\}$

image-20200804103129765

image-20200804103230475